智能论文笔记

Gaussian Kernel Mixture Network for Single Image Defocus Deblurring

Yuhui Quan , Zicong Wu , Hui Ji

分类：计算机视觉

2021-10-31

散焦模糊是图像中经常看到的一种模糊效果，这是由于其空间变体的量而挑战。本文介绍了一种用于从单个图像中移除散焦模糊的端到端深度学习方法，以便具有随之而来的视觉任务的全焦点图像。首先，提出了一种用于以有效的线性参数形式表示空间变体散焦模糊核的像素 - WISE高斯核混合物（GKM）模型，其比现有模型更高。然后，通过展开基于GKM的去纹理的定点迭代来开发称为GKMNet的深神经网络。 GKMNET构建在轻量级刻度 - 重复间体系结构上，具有比例 - 复制注意力模块，用于估计GKM中的混合系数用于散焦去孔。广泛的实验表明，GKMNET不仅明显优于现有的散焦去纹理方法，而且还具有其在模型复杂性和计算效率方面的优势。

translated by 谷歌翻译

Lifting-wing Quadcopter Modeling and Unified Control

Quan Quan , Wang Shuai , Gao Wenhan

分类：机器人

2023-01-02

Hybrid unmanned aerial vehicles (UAVs) integrate the efficient forward flight of fixed-wing and vertical takeoff and landing (VTOL) capabilities of multicopter UAVs. This paper presents the modeling, control and simulation of a new type of hybrid micro-small UAVs, coined as lifting-wing quadcopters. The airframe orientation of the lifting wing needs to tilt a specific angle often within $ 45$ degrees, neither nearly $ 90$ nor approximately $ 0$ degrees. Compared with some convertiplane and tail-sitter UAVs, the lifting-wing quadcopter has a highly reliable structure, robust wind resistance, low cruise speed and reliable transition flight, making it potential to work fully-autonomous outdoor or some confined airspace indoor. In the modeling part, forces and moments generated by both lifting wing and rotors are considered. Based on the established model, a unified controller for the full flight phase is designed. The controller has the capability of uniformly treating the hovering and forward flight, and enables a continuous transition between two modes, depending on the velocity command. What is more, by taking rotor thrust and aerodynamic force under consideration simultaneously, a control allocation based on optimization is utilized to realize cooperative control for energy saving. Finally, comprehensive Hardware-In-the-Loop (HIL) simulations are performed to verify the advantages of the designed aircraft and the proposed controller.

translated by 谷歌翻译

Refined Edge Usage of Graph Neural Networks for Edge Prediction

Jiarui Jin , Yangkun Wang , Weinan Zhang , Quan Gan , Xiang Song , Yong Yu , Zheng Zhang , David Wipf

分类：机器学习 | 人工智能

2022-12-25

Graph Neural Networks (GNNs), originally proposed for node classification, have also motivated many recent works on edge prediction (a.k.a., link prediction). However, existing methods lack elaborate design regarding the distinctions between two tasks that have been frequently overlooked: (i) edges only constitute the topology in the node classification task but can be used as both the topology and the supervisions (i.e., labels) in the edge prediction task; (ii) the node classification makes prediction over each individual node, while the edge prediction is determinated by each pair of nodes. To this end, we propose a novel edge prediction paradigm named Edge-aware Message PassIng neuRal nEtworks (EMPIRE). Concretely, we first introduce an edge splitting technique to specify use of each edge where each edge is solely used as either the topology or the supervision (named as topology edge or supervision edge). We then develop a new message passing mechanism that generates the messages to source nodes (through topology edges) being aware of target nodes (through supervision edges). In order to emphasize the differences between pairs connected by supervision edges and pairs unconnected, we further weight the messages to highlight the relative ones that can reflect the differences. In addition, we design a novel negative node-pair sampling trick that efficiently samples 'hard' negative instances in the supervision instances, and can significantly improve the performance. Experimental results verify that the proposed method can significantly outperform existing state-of-the-art models regarding the edge prediction task on multiple homogeneous and heterogeneous graph datasets.

translated by 谷歌翻译

Differential Flatness of Lifting-Wing Quadcopters Subject to Drag and Lift for Accurate Tracking

Shuai Wang , Wenhan Gao , Quan Quan

分类：机器人

2022-12-25

In this paper, we propose an effective unified control law for accurately tracking agile trajectories for lifting-wing quadcopters with different installation angles, which have the capability of vertical takeoff and landing (VTOL) as well as high-speed cruise flight. First, we derive a differential flatness transform for the lifting-wing dynamics with a nonlinear model under coordinated turn condition. To increase the tracking performance on agile trajectories, the proposed controller incorporates the state and input variables calculated from differential flatness as feedforward. In particular, the jerk, the 3-order derivative of the trajectory, is converted into angular velocity as a feedforward item, which significantly improves the system bandwidth. At the same time, feedback and feedforward outputs are combined to deal with external disturbances and model mismatch. The control algorithm has been thoroughly evaluated in the outdoor flight tests, which show that it can achieve accurate trajectory tracking.

translated by 谷歌翻译

Distributed Control within a Trapezoid Virtual Tube Containing Obstacles for UAV Swarm Subject to Speed Constraints

Yan Gao , Chenggang Bai , Quan Quan

分类：机器人

2022-12-24

For guiding the UAV swarm to pass through narrow openings, a trapezoid virtual tube is designed in our previous work. In this paper, we generalize its application range to the condition that there exist obstacles inside the trapezoid virtual tube and UAVs have strict speed constraints. First, a distributed vector field controller is proposed for the trapezoid virtual tube with no obstacle inside. The relationship between the trapezoid virtual tube and the speed constraints is also presented. Then, a switching logic for the obstacle avoidance is put forward. The key point is to divide the trapezoid virtual tube containing obstacles into several sub trapezoid virtual tubes with no obstacle inside. Formal analyses and proofs are made to show that all UAVs are able to pass through the trapezoid virtual tube safely. Besides, the effectiveness of the proposed method is validated by numerical simulations and real experiments.

translated by 谷歌翻译

Empirical Analysis of AI-based Energy Management in Electric Vehicles: A Case Study on Reinforcement Learning

Jincheng Hu , Yang Lin , Jihao Li , Zhuoran Hou , Dezong Zhao , Quan Zhou , Jingjing Jiang , Yuanjian Zhang

分类：人工智能 | 机器学习

2022-12-18

Reinforcement learning-based (RL-based) energy management strategy (EMS) is considered a promising solution for the energy management of electric vehicles with multiple power sources. It has been shown to outperform conventional methods in energy management problems regarding energy-saving and real-time performance. However, previous studies have not systematically examined the essential elements of RL-based EMS. This paper presents an empirical analysis of RL-based EMS in a Plug-in Hybrid Electric Vehicle (PHEV) and Fuel Cell Electric Vehicle (FCEV). The empirical analysis is developed in four aspects: algorithm, perception and decision granularity, hyperparameters, and reward function. The results show that the Off-policy algorithm effectively develops a more fuel-efficient solution within the complete driving cycle compared with other algorithms. Improving the perception and decision granularity does not produce a more desirable energy-saving solution but better balances battery power and fuel consumption. The equivalent energy optimization objective based on the instantaneous state of charge (SOC) variation is parameter sensitive and can help RL-EMSs to achieve more efficient energy-cost strategies.

translated by 谷歌翻译

RT-1: Robotics Transformer for Real-World Control at Scale

Anthony Brohan , Noah Brown , Justice Carbajal , Yevgen Chebotar , Joseph Dabis , Chelsea Finn , Keerthana Gopalakrishnan , Karol Hausman , Alex Herzog , Jasmine Hsu

分类：机器人 | 人工智能 | 自然语言处理 | 计算机视觉 | 机器学习

2022-12-13

By transferring knowledge from large, diverse, task-agnostic datasets, modern machine learning models can solve specific downstream tasks either zero-shot or with small task-specific datasets to a high level of performance. While this capability has been demonstrated in other fields such as computer vision, natural language processing or speech recognition, it remains to be shown in robotics, where the generalization capabilities of the models are particularly critical due to the difficulty of collecting real-world robotic data. We argue that one of the keys to the success of such general robotic models lies with open-ended task-agnostic training, combined with high-capacity architectures that can absorb all of the diverse, robotic data. In this paper, we present a model class, dubbed Robotics Transformer, that exhibits promising scalable model properties. We verify our conclusions in a study of different model classes and their ability to generalize as a function of the data size, model size, and data diversity based on a large-scale data collection on real robots performing real-world tasks. The project's website and videos can be found at robotics-transformer.github.io

translated by 谷歌翻译

Web-based Experiment on Human Performance in Dual-Robot Teleoperation

Yuhui Wan , Chengxu Zhou

分类：机器人

2022-12-13

In most cases, upgrading from a single-robot system to a multi-robot system comes with increases in system payload and task performance. On the other hand, many multi-robot systems in open environments still rely on teleoperation. Therefore, human performance can be the bottleneck in a teleoperated multi-robot system. Based on this idea, the multi-robot system's shared autonomy and control methods are emerging research areas in open environment robot operations. However, the question remains: how much does the bottleneck of the human agent impact the system performance in a multi-robot system? This research tries to explore the question through the performance comparison of teleoperating a single-robot system and a dual-robot system in a box-pushing task. This robot teleoperation experiment on human agents employs a web-based environment to simulate the robots' two-dimensional movement. The result provides evidence of the hardship for a single human when teleoperating with more than one robot, which indicates the necessity of shared autonomy in multi-robot systems.

translated by 谷歌翻译

PyPop7: A Pure-Python Library for Population-Based Black-Box Optimization

Qiqi Duan , Guochen Zhou , Chang Shao , Zhuowei Wang , Mingyang Feng , Yijun Yang , Qi Zhao , Yuhui Shi

分类：神经与进化计算

2022-12-12

In this paper, we present a pure-Python open-source library, called PyPop7, for black-box optimization (BBO). It provides a unified and modular interface for more than 60 versions and variants of different black-box optimization algorithms, particularly population-based optimizers, which can be classified into 12 popular families: Evolution Strategies (ES), Natural Evolution Strategies (NES), Estimation of Distribution Algorithms (EDA), Cross-Entropy Method (CEM), Differential Evolution (DE), Particle Swarm Optimizer (PSO), Cooperative Coevolution (CC), Simulated Annealing (SA), Genetic Algorithms (GA), Evolutionary Programming (EP), Pattern Search (PS), and Random Search (RS). It also provides many examples, interesting tutorials, and full-fledged API documentations. Through this new library, we expect to provide a well-designed platform for benchmarking of optimizers and promote their real-world applications, especially for large-scale BBO. Its source code and documentations are available at https://github.com/Evolutionary-Intelligence/pypop and https://pypop.readthedocs.io/en/latest, respectively.

translated by 谷歌翻译

Tencent AVS: A Holistic Ads Video Dataset for Multi-modal Scene Segmentation

Jie Jiang , Zhimin Li , Jiangfeng Xiong , Rongwei Quan , Qinglin Lu , Wei Liu

分类：计算机视觉

2022-12-09

Temporal video segmentation and classification have been advanced greatly by public benchmarks in recent years. However, such research still mainly focuses on human actions, failing to describe videos in a holistic view. In addition, previous research tends to pay much attention to visual information yet ignores the multi-modal nature of videos. To fill this gap, we construct the Tencent `Ads Video Segmentation'~(TAVS) dataset in the ads domain to escalate multi-modal video analysis to a new level. TAVS describes videos from three independent perspectives as `presentation form', `place', and `style', and contains rich multi-modal information such as video, audio, and text. TAVS is organized hierarchically in semantic aspects for comprehensive temporal video segmentation with three levels of categories for multi-label classification, e.g., `place' - `working place' - `office'. Therefore, TAVS is distinguished from previous temporal segmentation datasets due to its multi-modal information, holistic view of categories, and hierarchical granularities. It includes 12,000 videos, 82 classes, 33,900 segments, 121,100 shots, and 168,500 labels. Accompanied with TAVS, we also present a strong multi-modal video segmentation baseline coupled with multi-label class prediction. Extensive experiments are conducted to evaluate our proposed method as well as existing representative methods to reveal key challenges of our dataset TAVS.

translated by 谷歌翻译